# Test of Hypothesis, P Values and Related Concepts

## The Principle of the Hypothesis Test

The principle is to formulate a null hypothesis and an alternative hypothesis, $H_0$ and $H_1$ respectively, then select a test statistic with a known distribution when $H_0$ is true, and select a rejection region which has a specified probability $(\alpha)$ when $H_0$ is true. The rejection region is chosen to reflect $H_1$, i.e. to ensure a high probability of rejection when $H_1$ is true.
### Examples

Flip a coin to test

$$H_0: p = \frac{1}{2} \quad \text{vs} \quad H_1: p \neq \frac{1}{2}$$

Reject $H_0$ if no heads or all heads are obtained in 6 trials. The error rate is

$$\begin{aligned} P[\text{Reject } H_0 \text{ when true}] &= P[\text{All heads or all tails}] \\ &= P[\text{All heads}] + P[\text{All tails}] \\ &= \frac{1}{2^6} + \frac{1}{2^6} = 2 \cdot \frac{1}{64} = \frac{1}{32} < 0.05 \end{aligned}$$

A variation of this test is called the sign test, which is used to test hypotheses of the form $H_0$: the true median equals zero, using a count of the number of positive values.
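The error-rate calculation above, and the sign test itself, can be sketched in R; here the data vector `x` is a hypothetical sample, and `binom.test` supplies the exact binomial test underlying the sign test:

```r
# Probability of all heads or all tails in 6 fair-coin flips
2 * 0.5^6                     # 0.03125 < 0.05

# Sign test of H0: true median is zero, on a hypothetical sample
x <- c(0.3, -1.2, 2.1, 0.8, -0.4, 1.7)
binom.test(sum(x > 0), length(x), p = 0.5)
```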
## The One-sided z-test for a Normal Mean

Consider testing

$$H_0: \mu = \mu_0 \quad \text{vs} \quad H_1: \mu > \mu_0$$

where the data $x_1, \ldots, x_n$ are collected as independent observations of $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$ and $\sigma^2$ is known.
If $H_0$ is true, then

$$\bar{X} \sim N\!\left(\mu_0, \frac{\sigma^2}{n}\right)$$

so

$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1)$$
It follows that

$$P[Z > z^\ast] = \alpha$$

where

$$z^\ast = z_{1-\alpha}$$
So if the data $x_1, \ldots, x_n$ are such that

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} > z^\ast$$

then $H_0$ is rejected.
### Examples

Consider the following data set: $47, 42, 41, 45, 46$. Suppose we want to test the hypothesis

$$H_0: \mu = 42 \quad \text{vs} \quad H_1: \mu > 42$$

where $\sigma = 2$ is given.
The mean of the given data set is

$$\bar{x} = 44.2$$

and we can calculate $z$ using the following equation:

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{44.2 - 42}{2/\sqrt{5}} = \frac{2.2}{0.8944} = 2.459$$

Here $\alpha = 0.05$, so

$$z^\ast = z_{0.95} = 1.645$$

Since $2.459 > 1.645$, i.e. $z > z^\ast$, $H_0$ is rejected at level $\alpha = 0.05$.
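This computation can be reproduced in R; a minimal sketch using the data and the known $\sigma = 2$ from above:

```r
x     <- c(47, 42, 41, 45, 46)
mu0   <- 42
sigma <- 2

z <- (mean(x) - mu0) / (sigma / sqrt(length(x)))
z                 # 2.459675
z > qnorm(0.95)   # TRUE, so H0 is rejected at alpha = 0.05
```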
## The Two-sided z-test for a Normal Mean

$$z := \frac{\overline{x}-\mu_0}{\sigma/\sqrt{n}} \sim N(0,1)$$
### Details

Consider testing $H_0: \mu=\mu_0$ versus $H_1: \mu \ne \mu_0$, based on independent and identically distributed observations of $X_1, \dots, X_n \sim N(\mu, \sigma^2)$, where $\sigma^2$ is known.

If $H_0$ is true, then

$$Z := \frac{\overline{X}-\mu_0}{\sigma/\sqrt{n}} \sim N(0,1)$$
and

$$P[|Z| > z^\ast] = \alpha$$

with

$$z^\ast = z_{1-\alpha/2}$$

We reject $H_0$ if $|z| > z^\ast$; otherwise $H_0$ cannot be rejected.
### Examples

In R, one can estimate the critical value for the two-sided test by simulation, using the quantile command:

```r
z <- rnorm(1000, 0, 1)
quantile(z, c(0.025, 0.975))
##      2.5%     97.5%
## -1.995806  2.009849
```

So the simulated critical value for a two-sided test with $\alpha = 0.05$ is about $2$; the exact value is $z_{0.975} = 1.96$, obtained with `qnorm(0.975)`.
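Putting the pieces together, the whole two-sided test might be sketched as an R function; the function name and the example data are hypothetical, and $\sigma$ is assumed known:

```r
# Two-sided z-test for a normal mean with known sigma (sketch)
two_sided_z_test <- function(x, mu0, sigma, alpha = 0.05) {
  z     <- (mean(x) - mu0) / (sigma / sqrt(length(x)))
  zstar <- qnorm(1 - alpha / 2)   # z_{1-alpha/2}, e.g. 1.96
  list(z = z, zstar = zstar, reject = abs(z) > zstar)
}

two_sided_z_test(c(47, 42, 41, 45, 46), mu0 = 42, sigma = 2)
```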
## The One-sided t-test for a Single Normal Mean

Recall that if $X_1,\dots,X_n \sim N(\mu,\sigma^2)$ are independent and identically distributed, then

$$\frac{\overline{X}-\mu}{S/\sqrt{n}} \sim t_{n-1}$$

### Details
To test the hypothesis $H_0:\mu=\mu_0$ vs $H_1:\mu > \mu_0$, first note that if $H_0$ is true, then

$$T = \frac{\overline{X}-\mu_0}{S/\sqrt{n}} \sim t_{n-1}$$
so

$$P[T>t^\ast]=\alpha$$

if

$$t^\ast=t_{n-1,1-\alpha}$$
Hence, we reject $H_0$ if the data $x_1,\dots,x_n$ result in a value of $t := \frac{\overline{x}-\mu_0}{s/\sqrt{n}}$ such that $t>t^\ast$; otherwise $H_0$ cannot be rejected.
### Examples

Suppose the data set $(12, 19, 17, 23, 15, 27)$ comes independently from a normal distribution and we need to test $H_0:\mu=\mu_0$ vs $H_1:\mu>\mu_0$ with $\mu_0 = 18$.

Here we have $n=6$, $\overline{x}=18.83$ and $s=5.46$, so we obtain

$$t=\frac{\overline{x}-\mu_0}{s/\sqrt{n}} = \frac{18.83-18}{5.46/\sqrt{6}} = 0.37$$

Since $0.37 < t^\ast = t_{5,\,0.95} = 2.015$, $H_0$ cannot be rejected.
In R, $t^\ast$ is found using `qt(0.95, n-1)` (note that the probability comes first), but the entire hypothesis test can be carried out with:

```r
x <- c(12, 19, 17, 23, 15, 27)
t.test(x, alternative = "greater", mu = 18)
```
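As a cross-check, the statistic and the critical value can also be computed directly from the formulas above (a sketch using the same data):

```r
x <- c(12, 19, 17, 23, 15, 27)
t <- (mean(x) - 18) / (sd(x) / sqrt(length(x)))
t                             # about 0.37
t > qt(0.95, length(x) - 1)   # FALSE, so H0 cannot be rejected
```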
## Comparing Means from Normal Populations

Suppose data are gathered independently from two normal populations, resulting in $x_1,\dots,x_n$ and $y_1,\dots,y_m$.
### Details

We know that if

$$X_1, \dots, X_n \sim N(\mu_1,\sigma^2)$$

$$Y_1, \dots, Y_m \sim N(\mu_2,\sigma^2)$$

are all independent, then

$$\bar{X}-\bar{Y} \sim N\!\left(\mu_1-\mu_2,\frac{\sigma^2}{n}+\frac{\sigma^2}{m}\right)$$
Further,

$$\sum_{i=1}^{n} \frac{(X_i-\bar{X})^2}{\sigma^2} \sim \chi_{n-1}^{2}$$

and

$$\sum_{j=1}^{m} \frac{(Y_j-\bar{Y})^2}{\sigma^2} \sim \chi_{m-1}^{2}$$

so

$$\frac{\sum_{i=1}^{n}(X_i-\bar{X})^2 + \sum_{j=1}^{m}(Y_j-\bar{Y})^2}{\sigma^2} \sim \chi_{n+m-2}^2$$
and it follows that

$$\frac{\bar{X}-\bar{Y}-(\mu_1-\mu_2)}{S\sqrt{\frac{1}{n}+\frac{1}{m}}} \sim t_{n+m-2}$$

where

$$S=\sqrt{\frac{\sum_{i=1}^{n}(X_i-\bar{X})^2+\sum_{j=1}^{m}(Y_j-\bar{Y})^2}{n+m-2}}$$
Consider testing $H_0:\mu_1=\mu_2$ vs $H_1:\mu_1>\mu_2$. If $H_0$ is true, then the observed value

$$t=\frac{\bar{x}-\bar{y}}{S\sqrt{\frac{1}{n}+\frac{1}{m}}}$$

comes from a $t$ distribution with $n+m-2$ degrees of freedom, and we reject $H_0$ if $t>t^\ast$.
Here,

$$S=\sqrt{\frac{\sum_{i}(x_i-\bar{x})^2+\sum_{j}(y_j-\bar{y})^2}{n+m-2}}$$

and $t^\ast=t_{n+m-2,1-\alpha}$.
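This pooled test corresponds to `t.test` with `var.equal = TRUE` in R; a sketch with hypothetical samples `x` and `y`:

```r
# Hypothetical samples from two normal populations with common variance
x <- c(12, 19, 17, 23, 15, 27)
y <- c(10, 14, 16, 11, 18)

# Pooled two-sample t-test of H0: mu1 = mu2 vs H1: mu1 > mu2
t.test(x, y, alternative = "greater", var.equal = TRUE)
```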
## Comparing Means from Large Samples

If $X_1,\dots,X_n$ and $Y_1,\dots,Y_m$ are all independent (with finite variance), with expected values $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$ respectively, then

$$\frac{\overline{X}-\overline{Y}-(\mu_1-\mu_2)}{\sqrt{\frac{\sigma_1^2}{n}+\frac{\sigma_2^2}{m}}} \sim N(0,1)$$

approximately, if the sample sizes are large enough. This is a consequence of the central limit theorem.
### Details

Another theorem (Slutsky's) states that replacing $\sigma_1^2$ and $\sigma_2^2$ with the sample variances $S_1^2$ and $S_2^2$ results in the same (limiting) distribution.
It follows that for large samples we can test

$$H_0: \mu_1=\mu_2 \quad \text{vs} \quad H_1:\mu_1 > \mu_2$$

by computing

$$z=\frac{\overline{x}-\overline{y}}{\sqrt{\frac{s_1^2}{n}+\frac{s_2^2}{m}}}$$

and rejecting $H_0$ if $z>z_{1-\alpha}$.
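A direct computation of this large-sample $z$ statistic might look as follows (a sketch; the simulated samples are hypothetical):

```r
# Hypothetical large samples from two populations
set.seed(42)
x <- rnorm(200, mean = 10, sd = 2)
y <- rnorm(150, mean = 9, sd = 3)

# Large-sample z statistic for H0: mu1 = mu2 vs H1: mu1 > mu2
z <- (mean(x) - mean(y)) / sqrt(var(x)/length(x) + var(y)/length(y))
z > qnorm(0.95)   # TRUE means H0 is rejected at alpha = 0.05
```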
## The P-value

The $p$-value of a test is the probability, computed under the null hypothesis, of obtaining results at least as extreme as those observed.
### Examples

Consider a dataset and the following hypotheses:

$$H_0:\mu=42 \quad \text{vs} \quad H_1:\mu>42$$

and suppose we obtain

$$z=2.3$$
We reject $H_0$ since

$$2.3 > 1.645 = z_{0.95}$$
The $p$-value is

$$P[Z>2.3]= 1-\Phi(2.3)$$

obtained in R using:

```r
1 - pnorm(2.3)
## [1] 0.01072411
```
If this had been a two-tailed test, then

$$\begin{aligned} p &= P[|Z|>2.3] \\ &= P[Z<-2.3]+P[Z>2.3] \\ &= 2\cdot P[Z>2.3] \approx 0.0214 \end{aligned}$$

which in R is `2 * (1 - pnorm(2.3))`.

## The Concept of Significance

### Details

Two sample means are statistically significantly different if the null hypothesis $H_0:\mu_1 = \mu_2$ can be rejected.
In this case, one can make the following statements:
- The population means are different.
- The sample means are significantly different.
- $\mu_1 \ne \mu_2$.
- $\bar{x}$ is significantly different from $\bar{y}$.

But one does not say:

- The sample means are different.
- The population means are different with probability $0.95$.

Similarly, if the hypothesis $H_0: \mu_1 = \mu_2$ cannot be rejected, we can say:

- There is no significant difference between the sample means.
- We cannot reject the equality of the population means.
- We cannot rule out the possibility that the population means are equal.

But we cannot say:

- The sample means are equal.
- The population means are equal.
- The population means are equal with probability $0.95$.